Enhanced Spatial Stream of Two-Stream Network Using Optical Flow for Human Action Recognition
Authors
Abstract
Introduction: Convolutional neural networks (CNNs) have maintained their dominance in deep learning methods for human action recognition (HAR) and other computer vision tasks. However, the need for a large amount of training data always restricts the performance of CNNs. Method: This paper is inspired by the two-stream network, where a CNN is deployed to train the network using the spatial and temporal aspects of an activity, thus exploiting the strengths of both to achieve better accuracy. Contributions: Our contribution is twofold: first, we deploy an enhanced spatial stream, and it is demonstrated that models pre-trained on a larger dataset, when used in the spatial stream, yield good performance instead of training the entire model from scratch. Second, a dataset augmentation technique is presented to minimize overfitting of CNNs, where we increase the dataset size by performing various transformations on the images, such as rotation and flipping. Results: UCF101 is a standard benchmark dataset of action videos, and our architecture has been trained and validated on it. Compared with other two-stream networks, our results outperformed them in terms of ...
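The augmentation step mentioned in the abstract (enlarging the dataset with rotated and flipped copies of each image) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names are hypothetical, frames are represented as plain nested lists of grayscale values, and rotation is restricted to multiples of 90 degrees as a simplifying assumption. The abstract does not state which angles or libraries were used.

```python
from typing import List

# Assumption: a grayscale frame stored as rows of pixel values.
Image = List[List[int]]


def flip_horizontal(img: Image) -> Image:
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def rotate_90_clockwise(img: Image) -> Image:
    """Rotate the image by 90 degrees clockwise."""
    # Reversing the row order and then transposing gives a clockwise turn.
    return [list(col) for col in zip(*img[::-1])]


def augment(img: Image) -> List[Image]:
    """Return the original frame plus one flipped and three rotated copies."""
    out = [img, flip_horizontal(img)]
    rotated = img
    for _ in range(3):  # 90, 180, and 270 degrees
        rotated = rotate_90_clockwise(rotated)
        out.append(rotated)
    return out


frame = [[1, 2, 3],
         [4, 5, 6]]
augmented = augment(frame)
print(len(augmented), "training samples from one frame")
```

Each source frame yields five samples here, so the training set grows by that factor. A real pipeline would more likely apply small random rotations through an image library.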
Similar resources
Two-Stream Convolutional Networks for Action Recognition in Videos
We investigate architectures of discriminatively trained deep Convolutional Networks (ConvNets) for action recognition in video. The challenge is to capture the complementary information on appearance from still frames and motion between frames. We also aim to generalise the best performing hand-crafted features within a data-driven learning framework. Our contribution is three-fold. First, we ...
Hidden Two-Stream Convolutional Networks for Action Recognition
Analyzing videos of human actions involves understanding the temporal relationships among video frames. CNNs are the current state-of-the-art methods for action recognition in videos. However, the CNN architectures currently being used have difficulty in capturing these relationships. State-of-the-art action recognition approaches rely on traditional local optical flow estimation methods to pre...
Two-Stream SR-CNNs for Action Recognition in Videos
Human action is a high-level concept in computer vision research and understanding it may benefit from different semantics, such as human pose, interacting objects, and scene context. In this paper, we explicitly exploit semantic cues with aid of existing human/object detectors for action recognition in videos, and thoroughly study their effect on the recognition performance for different types...
Two-Stream convolutional nets for action recognition in untrimmed video
We extend the two-stream convolutional net architecture developed by Simonyan for action recognition in untrimmed video clips. The main challenges of this project are first replicating the results of Simonyan et al, and then extending the pipeline to apply it to much longer video clips in which no actions of interest are taking place most of the time. We explore aspects of the performance of th...
Journal
Journal title: Applied Sciences
Year: 2023
ISSN: 2076-3417
DOI: https://doi.org/10.3390/app13148003